ABSTRACT

This document is a guide for use by the practical 
researcher concerned with conducting studies of transfer of learning
from pretraining of pilots in ground-based environments to 
performance in aircraft. While the material addresses principally 
transfer of learning of pilots, many of the issues should be 
applicable to other contexts, to include training of other aircrew 
members or even individuals who have quite different tasks to 
perform. The paper does not deal with theory but, rather, is 
concerned entirely with method of the transfer study. Method issues, 
including the planning, task, students, performance measurement, 
instructors, and analyses, are central to arriving at precise 
estimates of transfer effects, approaching as closely as possible the
maximum that might have been demonstrated and providing a goal for the
operational instructor pilot. Study models discussed include those
for percent transfer of learning and for the transfer effectiveness 
ratio. Use of the latter should be essential in providing answers to 
contemporary questions concerning how much simulator pretraining can 
be used to replace aircraft training time without reducing the
production of the combat effective pilot. The guide was developed
through review of published and unpublished studies of transfer of 
training from ground-based simulator to actual plane flying done 
during the past twenty or more years. Eleven steps were identified 
and are sequenced in the guide for use by researchers. (Author/KC) 
SUMMARY



Objective 

The objective was a practical guide for use in conducting studies of the transfer of learning from training in a 
flight simulator to performance in an aircraft. 

Background/Rationale 

Studies of transfer of learning usually have the goal of providing information about the effectiveness of 
training techniques and/or equipment for use in designing or upgrading training programs. The likelihood that the 
information will be used depends on the extent to which both study method and results are convincing in the eyes
of the operational user. Studies demonstrating large performance effects resulting from simulator pretraining
certainly will be the most convincing and, other things being equal, will be the most likely to promote the adoption 
and use of new training techniques or equipment during operational flight training. 

During the past three decades, numerous studies have investigated the effects of training in ground-based 
flight training devices on subsequent performance in the aircraft. These studies have employed a variety of 
experimental techniques. Some of the techniques used were scientifically sound, while others were 
methodologically flawed and resulted in findings of questionable validity. This diversity of approaches probably 
resulted in large part from differences in the scientific sophistication or applied research experience of the
investigators, as well as conditions peculiar to the specific settings in which the studies were performed. A review
and consolidation of the lessons learned from previous studies should be beneficial in guiding future efforts 
towards increased validity and practical utility. 

Approach 

The approach used was to review published and unpublished information on transfer of learning and 
experimental design relevant to pilot training. This information was then carefully analyzed to identify the key 
issues and factors that must be considered in order to conduct useful transfer-of-learning studies in a flight
training environment. Finally, a sequence of steps to be followed by the practical researcher in conducting credible 
studies was developed and put in guidebook form. 

Specifics 

The concept of transfer of learning is defined in the guide as any measurable effect of training in a prior task 
on performance in a subsequent task. The procedures of the typical transfer study are described, and two measures
of transfer of learning (i.e., percent transfer and the transfer effectiveness ratio) are defined. Initial discussion of
the transfer-of-learning study emphasizes the importance of planning. The remainder of the report identifies and
describes 11 steps to take in performing a successful transfer-of-learning study. 

The first step is definition of the immediate problem. Its importance is illustrated by asking and considering 
the answers to a number of questions that serve to focus and sharpen the definition of the research problem. 
Selection of the task or tasks to be trained is the second step identified. Criteria for selecting the training tasks are 
suggested. In addition, reasons for identifying research resource requirements early in the study are pointed out. 

The third and fourth steps involve the determination of what learners should be involved in the study and the 
identification of appropriate performance measures. A number of critical aspects of these issues are discussed,
including the composition of the sample of learners, their assignment to study groups, and the development of 
objective performance criteria to serve as a basis for evaluating the learner's performance in the simulator and in 
the aircraft. 

The use of the instructor as a research participant and the planning of sufficient time for the study are the fifth
and sixth steps. The seventh step involves the avoidance within a study of factors that may dilute transfer of
learning. Advance scheduling and the need for planning the study to be run in the midst of normal flying training
operations are emphasized in steps eight and nine.



Step ten, testing the methodology before collecting final data, and step eleven, the analysis of the data,
conclude the presentation of the procedures for conducting a transfer-of-learning study. 



Conclusions/Recommendations 

This guide provides the practical researcher with valuable guidelines for conducting studies of transfer of 
learning from training in a simulator to performance in aircraft. In addition, the guide is applicable to a variety of
synthetic pretraining environments, including a mix of ground training facilities such as audio-visual media,
part-task trainers, and relatively sophisticated simulators.

It is recommended that the guide be given wide distribution in both the training research and operational
training communities.
PREFACE



This report was prepared under Consulting Agreement RI-81923 (Revised) with the University
of Dayton Research Institute with Dr. Harold D. Warner as project director. This report is a segment
of a larger University of Dayton Research Institute effort conducted under contract F33615-77-C-0054
with the Operations Training Division of the Air Force Human Resources Laboratory, Williams
AFB, Arizona. The report represents a portion of the on-going work within the Air Combat Training
Research Subthrust, and specifically the Flying Training Specialized Support and Data Base
Integration component. The associated Project Vanguard planning summary mission area is Support
and Technical Base development.
CONDUCTING STUDIES OF TRANSFER OF LEARNING:
A PRACTICAL GUIDE 



I. INTRODUCTION AND PURPOSE 

This report has been prepared for use by the practical researcher who is concerned with studies of
transfer of learning from pretraining of pilots in a simulator to their performance in aircraft. The
expressions "transfer of learning" and "transfer of training" tend to be used nearly interchangeably.
Although the distinction may be somewhat trivial, the former is used here since it is the learning, not the
process of training, that may transfer from work on a prior task to performance on a second. Also, while
the term "simulator" is used here for purposes of brevity, it is not intended to be restrictive in nature but,
rather, can be considered to refer to various types of synthetic pretraining environments, frequently a
mix of classroom facilities, audio-visual facilities, part-task trainers, and relatively sophisticated
simulators. While much of the language in this report will refer to pilots, flight simulators, and aircraft,
many of the issues should be applicable to other contexts, including training of other aircrew members or,
for that matter, training of individuals who have quite different tasks to perform.

The report will not deal with theory (such as the question of what transfers) because such issues are
covered elsewhere. The concern will be entirely with method of the transfer study, including the
consequences of failure to follow empirically derived principles. The material stems principally from the
experiences of the author and his associates, beginning with their work under guidance of the late
Professor Alexander C. Williams, Jr., who directed pioneer studies at his original Aviation Psychology
Laboratory of the University of Illinois. The report submits techniques and lessons learned from
experience, dating perhaps from 1949 when few prior rules were available to the researcher. Descriptions
of many of the techniques were not included in early papers for various reasons, and still other techniques
may have been considered too obvious to note. Over intervening years, however, it has become clear that
many of the issues are not at all obvious, and since they have been of great service in a number of previous
studies, the intent here is to make them available to others concerned with transfer research.

Issues of research method to be discussed have been found essential during attempts to arrive at
estimates of transfer that are precise, approaching as closely as possible the maximum that might have
been demonstrated during a particular study. Studies of transfer of learning are fragile in the sense that a
study that ignores too many issues of method is likely to lead to inconclusive results. Such inconclusive
results are serious because they can lead to disinterest on the part of both the research community and the
operational training community, disinterest in factors such as new instructional techniques or special
aspects of equipment used in the study. The resulting disservice is clear, considering that a carefully
planned and conducted study might have led to entirely different types of results supporting concepts that
might have been used with considerable value to the research and training communities.

At first glance, the transfer-of-learning study can appear deceptively simple when actually it is not.
The number of important issues can be legion, and the precision of subsequent results depends on the
compounding effects of many factors.

II. MODELS OF THE TRANSFER OF LEARNING STUDY 

Percent Transfer of Learning

"Transfer of learning" is defined here as any effect of learning resulting from pretraining on a prior
task (or set of tasks) upon performance in a subsequent task (or set of tasks). Such a transfer effect, if it
exists at all, could be facilitating in nature (comparative performance data suggesting positive transfer)
or it could be interfering in nature (comparative performance data suggesting negative transfer). Let us
assume at the outset that the carefully planned and conducted study will be concerned with a positive
transfer effect.

While various formulas have been offered for use in the percent transfer of learning model (Ellis,
1965; Gagne, Foster, and Crowley, 1948; Murdock, 1957), only one will be considered here. The model
makes use of a control group of students (who are not pretrained on a prior task and whose performance
data on a subsequent task serve as a standard) and one or more experimental groups of students (who are
pretrained on a prior task and whose performance data on the subsequent task are compared to those of
control students for purposes of estimating any transfer effect realized). For the purpose of this study, the
prior task(s) may be carried out in a simulator (or other synthetic training environment), with the
subsequent task(s) being carried out in an aircraft. The model is

$$\frac{C - X}{C} \times 100 = \text{percent transfer of learning}$$

where:

C: an average of trials, time, or errors accumulated by a control group of students to arrive at a
performance criterion in the aircraft.

X: an average of trials, time, or errors accumulated by an experimental group of students to
arrive at that same performance criterion in the aircraft, having been pretrained to a
performance criterion in a simulator.

Thus, using illustrative numbers:

$$\frac{10 - 5}{10} \times 100 = 50\text{-percent transfer of learning}$$

If those values represent hours of training in an aircraft, pretraining of experimental students in the
simulator resulted in a 50-percent saving in aircraft training time, on the average.

The numerator of the percent transfer of learning formula would have to be reversed if measurement
were in terms of performance grades, such that higher values represented better performance. Thus,

$$\frac{X - C}{C} \times 100 = \text{percent transfer of learning}$$

where:

X: an average of grades assigned to experimental students for performance in the aircraft.

C: an average of grades assigned to control students for performance in the aircraft.

Thus, if students were graded using a 12-point scale (with 12 being superior performance and 0 being
total failure), using illustrative numbers:

$$\frac{10.50 - 8.75}{8.75} \times 100 = 20\text{-percent transfer of learning}$$

In this case, pretraining of experimental students in the simulator resulted in a 20-percent higher grade
than attained by the control students, on the average.
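
As a minimal sketch of the two forms of the percent transfer computation described above, the following Python function takes the two group averages and a flag saying whether larger values mean better performance. The function name `percent_transfer` and the flag `higher_is_better` are illustrative assumptions, not part of the original models; the inputs are assumed to be the illustrative values of this section (10 and 5 hours to criterion; average grades of 8.75 and 10.50).

```python
def percent_transfer(control_avg: float, experimental_avg: float,
                     higher_is_better: bool = False) -> float:
    """Percent transfer of learning computed from two group averages.

    higher_is_better=False: averages are trials, time, or errors to
        criterion (smaller is better), so the numerator is C - X.
    higher_is_better=True: averages are performance grades
        (larger is better), so the numerator is reversed to X - C.
    """
    if control_avg <= 0:
        raise ValueError("control_avg must be positive")
    if higher_is_better:
        numerator = experimental_avg - control_avg
    else:
        numerator = control_avg - experimental_avg
    return numerator / control_avg * 100.0


# Hours of aircraft training to criterion (control, experimental).
print(round(percent_transfer(10, 5), 1))
# Average grades on a 12-point scale (control, experimental).
print(round(percent_transfer(8.75, 10.50, higher_is_better=True), 1))
```

A negative result from this sketch would correspond to negative transfer, that is, pretrained students performing worse than control students.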






The Transfer Effectiveness Ratio 

Recent concern of the pilot training community with increasing costs and shortage of energy led 
Roscoe (1971, 1972) to state quite a different model. Being concerned with the value of time, the model
provides an estimate of transfer effectiveness, using as a standard a measure of the amount of simulator
pretraining required by an experimental group of students to evidence superior performance in the
aircraft as compared to performance of a control group of students. The estimate can be given by:

$$\frac{C - X}{Y} = \text{the transfer effectiveness ratio}$$

where:

C: an average of trials or time required by a control group of students to arrive at a performance
criterion in the aircraft.

X: an average of trials or time required by an experimental group of students to arrive at that same
performance criterion in the aircraft, having been pretrained to a performance criterion in a
simulator.

Y: an average of trials or time required by the experimental group of students to arrive at a
performance criterion in the simulator.

Thus, using illustrative numbers:

$$\frac{10 - 5}{5} = 1.0 = \text{the transfer effectiveness ratio}$$

If those values represent hours of pretraining in the simulator and hours of training in the aircraft,
respectively, 1 hour of pretraining in the simulator saved 1 hour of training in the aircraft, on the
average.

As can be seen, the difference between the estimate of the percent transfer of learning and the
transfer effectiveness ratio is that the former ignores the amount of pretraining required in the simulator,
and the latter takes that factor into account. Contemporary questions concerning how much aircraft time
might be replaced with simulator time could be addressed principally through studies using the transfer
effectiveness ratio.

A later section of this report will consider the problem of the time required for the transfer study,
noting that the transfer effectiveness ratio model may suffer more from insufficient time to complete the
study. Further, since data necessary for the transfer effectiveness ratio model can be used to compute
percent transfer of learning estimates, there may be occasions when it would be of value to use both of
these models in the same study.
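
The point that one data set can feed both models can be sketched as follows. This is only an illustration under stated assumptions: the function name `transfer_measures`, the argument names, and the per-student hours are all hypothetical, chosen so that the group averages match the illustrative values of 10, 5, and 5 hours used in this section.

```python
from statistics import mean


def transfer_measures(control_aircraft, experimental_aircraft,
                      experimental_simulator):
    """Return (percent transfer, transfer effectiveness ratio).

    Each argument is a sequence of per-student trials or hours needed
    to reach the performance criterion (smaller is better).
    """
    control_avg = mean(control_aircraft)            # control group, aircraft
    experimental_avg = mean(experimental_aircraft)  # experimental group, aircraft
    simulator_avg = mean(experimental_simulator)    # experimental group, simulator
    if control_avg <= 0 or simulator_avg <= 0:
        raise ValueError("averages used as denominators must be positive")
    saving = control_avg - experimental_avg
    percent = saving / control_avg * 100.0
    ratio = saving / simulator_avg
    return percent, ratio


# Hypothetical per-student hours to criterion.
control_hours = [9.0, 10.0, 11.0]
experimental_hours = [4.5, 5.0, 5.5]
simulator_hours = [4.0, 5.0, 6.0]

percent, ratio = transfer_measures(control_hours, experimental_hours,
                                   simulator_hours)
print(f"percent transfer of learning: {percent:.1f}")
print(f"transfer effectiveness ratio: {ratio:.2f}")
```

Only the simulator hours are needed beyond what the percent transfer model already requires, which is why the two estimates can be reported side by side.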



III. THE IMPORTANCE OF PLANNING

It seems likely that more studies of transfer of learning do not succeed because of inadequate 
planning and preliminary work than because of any other factor. The study must be planned carefully if
results are to be of any real and practical value, and both planning and the study take time. During the
planning phase, a sound investment in time is necessary to carry out the work to be described here and to
identify and correct or adapt to the problems and the less than optimal limiting factors that may be
imposed by real-world constraints.

Preliminary Work ("Testing") 

As is the case with any formal study that costs time and money, the study of transfer of learning
should not be conducted without sound preliminary information that suggests the type of outcome likely
to be found. The formal study should not be conducted in an exploratory manner to establish trends or
directions of findings; rather, it should be conducted to arrive at an estimate of the magnitude of a
transfer effect. It should be concerned with reasonably substantial effects that could be of practical
significance in the real world, not with statistically significant trivia.

Trends, directions of findings, or the likely existence of a positive transfer effect should be 
established during one or more relatively simple tests from which ideas, hunches, or hypotheses evolve.
While the precise nature of such preliminary work will depend on the particular problem of the moment,
in some cases early testing might be fairly simple, using only a few students, relatively simple equipment,
and perhaps relatively crude performance measurements. Preliminary "mini-studies," assuming that
they involve reasonable care, can be invaluable, particularly if several experimental students who have
been pretrained in some specific manner seem to show dramatically superior performance in the air as
compared to performance of several control counterparts. Information obtained in this way can lead to a
highly useful formal study. 

Preliminary testing can provide other valuable insights as well. For example, deficiencies of the
simulation equipment could result in negative transfer effects. Preliminary work can help to identify such
problems, together with a means for solving them; in this case, planning for the process of training for
transfer, a subject to be discussed in a subsequent section of this report.

Designing for Maximum Possible Estimates of Transfer 

The goal of the researcher should be to plan and conduct a carefully controlled study, taking every
possible precaution in the design to ensure that the resulting estimates of transfer are precise; that is, that
they approach as closely as possible the maximum levels that could be demonstrated. Because of
uncontrollable variables, research-demonstrated techniques could result in less than optimal transfer
effects when used in an operational training program. Still, the researcher should attempt to demonstrate
the maximum possible transfer effects to show what can be accomplished and thereby provide a goal for
the operational instructor. Without knowing what could be done, the operational instructor could tend to
be satisfied with lesser results.



IV. THE FIRST STEP: DEFINITION OF THE IMMEDIATE PROBLEM 

Although the underlying question concerns the extent to which prelearning in a simulator will
transfer to performance in an aircraft, the first step should involve consideration of the specific purpose of 
the particular transfer study. Various specific purposes can have different associated problems such as the 
following. 

Will the study be concerned with combat readiness of experienced pilots facing reductions in aircraft
time for skills maintenance and reacquisition training? Prior to asking whether lost aircraft time might be 
replaced with simulator training, preliminary work should have to do with an assessment of degrees of 
combat readiness. Is there evidence of decay of skills with reduction of aircraft time? 

Will the study be concerned with effectiveness of basic pilot training in the face of reductions in 
aircraft time? Prior to asking whether aircraft time can be replaced with pretraining in a simulator, it 
would be well to be sure that effectiveness is actually reduced. 

Will the study be concerned with experienced pilots in transition to a new type of aircraft and 
mission? A preliminary question should ask whether there exist facilities that are truly adequate for 
pretraining work. 

Will the study be concerned with pilots returning to flight duties from predominantly administrative
assignments? Again, are there facilities that are truly adequate for pretraining work?

Although much of contemporary interest in using simulator pretraining is motivated by concerns
with costs of aircraft time and the energy problem, the nature of the synthetic training environment is
such that it can provide benefits over and beyond those of saving money or fuel. Does the purpose of the
study involve one or more of the following issues?

A well designed simulation facility can be used on an all-weather, 24-hour basis and as a substitute
when training aircraft are not available. In addition, it can provide a safe training environment; it can be
used to compress time during training, enabling concentration upon critical segments of flight tasks rather
than requiring that time be lost while flying to and from a practice area; and it can provide opportunities
for observation and measurement of student performance that ordinarily are not possible in the air. The
student can be interrogated easily on the spot concerning reasons for errors, and exercises can be rendered
standardized and repeatable, affording very precise assessments of learning progress. In the event that the
specific purpose of the study involves one or more of these issues, perhaps the major concern lies with the
measurement of percentage of transfer of learning rather than with arriving at an estimate of transfer
effectiveness.

In any event, it seems important that the researchers have identified all aspects of the purpose of the
transfer study being conducted. 

V. THE SECOND STEP: DEFINITION OF THE TASK 

Transfer of Learning for What Phase of the Curriculum? 

It is impracticable to attempt to measure transfer of learning for an entire curriculum through a 
single study. Thus the study is likely to be concerned with a specified phase of a training curriculum, such
as training for takeoff, approach and landing, instrument flight, attack on a ground target, air-to-air 
attack using a weapon-control subsystem, or other meaningful phase that has continuity. In some cases, it 
might be that even a particular phase is too complex to be dealt with in its entirety, requiring study of one 
or more segments. If it is desired to arrive at transfer estimates for several phases of a curriculum, it may 
be necessary to establish their order of priority. 

Decisions in this context must depend on requirements of operational organizations, and necessary 
background details must originate from those organizations. The contributions of highly experienced 
instructor pilots are very important during the early planning stage, and some studies may require
contributions on the part of additional operationally experienced pilots who are not necessarily
instructors. 

What Specific Tasks will be Involved? 

At the outset, the research team must derive definitions of tasks the student will be expected to 
perform in the operational situation represented in the study. Precisely how this is to be done will depend 
on the nature of the particular study. Past work has made use of operational sequence diagrams and 
pictorial diagrams of flight tasks. If the curriculum phase has been selected with care, use of such 
analytical techniques should result in a convenient number of tasks that can be defined fairly tightly. 

The instructor pilot can be of great help during this work by noting high frequency errors that have 
been made in the past, task segments that are of time-critical nature, and cues that appear to be necessary 
and sufficient in facilitating performance. These concepts will be considered further during discussion of 
performance measurement techniques because it is essential that measurement and tasks be related 
closely. 

VI. ASSESSMENT OF RESOURCES: AN ITERATIVE PROCESS

After arriving at a reasonably thorough set of task definitions, the research team must be certain that
resources available will enable conduct of the study. That question has to be addressed continuously as
planning progresses. Will available simulators be adequate for use during pretraining for the specified
tasks? Will pertinent aircraft, in which "proof of the pudding" performance measurements must be
taken, be available and in sufficient numbers? Will an instructor cadre be available and in sufficient
numbers? Will students of the necessary type be available in sufficient numbers? Will it be possible to
run a carefully controlled study in the midst of a busy operational training schedule? Will there be
problems in getting necessary support from the commander and the operations officer of the training
organization? Will all of these enabling factors continue to be available during the time required to carry
the study to completion?

Insufficiency of too many enabling factors could render conduct of the study infeasible or at least
could impose serious constraints on what can be accomplished. Thus the research team would do well to
keep in mind the question of adequacy of available resources during the entire planning process.

VII. THE THIRD STEP: WHICH STUDENTS WILL BE INVOLVED IN THE STUDY?

It may be that the question of which students will be involved in the study can be answered by the
nature of the immediate problem and the nature of the curriculum phase and tasks of interest to the study.
Earlier, four categories of pilots were mentioned: pilots requiring skills maintenance and reacquisition
training for combat readiness, students in basic flight training, experienced pilots in transition to a new
type of aircraft and mission, and pilots returning to flight duties from predominantly administrative
assignments. Clearly those categories of pilots represent at least four very different populations, probably
far more than that.

Sometimes the researcher may be tempted to extrapolate the transfer study data as far as possible,
perhaps wanting to arrive at more information than actually is feasible. The notion of mixing students
representative of several different populations of pilots in a single study is a case in point. But if that is
done, with the total sample of students being only of modest size, it is unlikely that results could be
applied to specific training situations. The rule should be to keep the student sample as homogeneous as
possible, particularly when only small samples are available.

Size of the Student Sample: Representative of What Population? 

The most frequently asked question may be that of sample size but, unfortunately, there rarely
seems to be a truly satisfactory answer. Perhaps the most useful approach is to try to keep the sample(s) as
representative as possible of a population of interest.

Ideally, the control and experimental students should be matched in terms of experience and
aptitude for the tasks at hand, but in reality, the notion of what "experience" really means is imperfect,
and the training research community would appear to have few truly useful tests of aptitude for specific
tasks likely to be involved in transfer studies. The total number of flight hours logged probably plays a
role in a definition of "experience," but there is at least some empirical evidence that this is by no means
an entirely useful predictor of performance levels.

It seems popular to state that the sample size should be as large as the situation permits and, in one
sense, that is probably correct. If, in an extreme case, every member of a particular pilot population could
be sampled, the accuracy of the predictions concerning transfer would be vastly improved. But that is
sheer fantasy, and in the practical world researchers usually have to make do with relatively small
samples, the sizes of which are limited by time, funds, and student availability. However, there is no
magic in large samples. A small sample composed of highly representative students is likely to yield
information of considerable value, whereas a large sample that is either heterogeneous in nature or is
characterized by a bias of some kind is likely to yield misinformation. Further, any effect (such as a
transfer estimate) that requires very large samples to show itself is unlikely to be of practical significance
(Hays, 1973, pp. 419-429; McNemar, 1940).

Efforts to match student samples between or among control and experimental groups in past studies
have had to be made using a great deal of common sense and in terms of types of students available. Some
researchers have used a combination of length of experience and experience in specific types of aircraft,
attempting to place equal numbers of such students in the several groups.

If the total available supply of students appears to be reasonably homogeneous (if at least there is no
specific reason to predict an imbalance of aptitudes and skills), perhaps the best that can be done is to
assign students to the several groups on a purely random basis. The principal concern, of course, is that, if
predominantly more apt students are assigned to a control group, a spuriously low transfer effect is likely
to be demonstrated, and conversely, if predominantly more apt students are assigned to an experimental
group, the demonstrated transfer effect is likely to be exaggerated. So, if relatively small groups must be
used (perhaps 8 to 12 students per group), how severe is the problem?

Suppose that a total of 16 students were available, 8 being assigned to an experimental group (and the
remaining 8 to a control group), such assignments being made at random because there was no real reason
to suspect serious differences in aptitude. Suppose further that the 16 students actually were ordered in
aptitude for the task at hand but that there was no way to estimate that ordering. This means that eight of
the students are the more apt, and with luck, four of them would be assigned to each group. A problem
would arise if all eight or seven or six or five of the more apt students had been assigned to the same
group. So, binomial probability can be used to estimate the chances of that happening.

$$P(r \mid n, p) = \binom{n}{r} p^{r} q^{n-r}$$

where r is the number of the n = 8 more apt students assigned to a given group and, by definition,
p = q = .5.

a. The probability that all eight of the more apt students had been assigned to the same group is
about .004 (4 chances in 1,000).

b. The probability that seven of the more apt students had been assigned to the same group is about
.03 (3 chances in 100).

c. The probability that six of the more apt students had been assigned to the same group is about .11
(11 chances in 100).

d. The probability that five of the more apt students had been assigned to the same group is about
.22 (22 chances in 100).

e. The sum of these probabilities (the probability that eight or seven or six or five of the more apt
students had been assigned to the same group) is about .36 (36 chances in 100).

While it is realized that this illustration involves a somewhat simplified set of assumptions (it does
not, for example, take into account the relative aptitude ranking of the eight more apt students), it does
serve to suggest that the probability of absolutely mismatched groups is quite low (p ≈ .004) and that the
range of probabilities, from seriously mismatched to moderately mismatched groups, is about .03 to .22.
These are fairly good odds in favor of a reasonably well matched group. What is more, if the study actually
does involve a sizable transfer effect, that effect should show itself even under the less favorable of these
situations.
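
The probabilities listed above can be reproduced with a few lines of code. The following is only a minimal sketch of the binomial formula as stated, under the same simplifying assumptions (n = 8 more apt students, p = q = .5); the function and variable names are illustrative and do not come from the original study.

```python
from math import comb


def binomial_probability(r: int, n: int, p: float = 0.5) -> float:
    """P(r | n, p) = C(n, r) * p**r * q**(n - r), with q = 1 - p."""
    q = 1.0 - p
    return comb(n, r) * p**r * q ** (n - r)


# Assumption from the illustration: 8 "more apt" students, p = q = .5.
n_more_apt = 8
probabilities = {r: binomial_probability(r, n_more_apt) for r in (8, 7, 6, 5)}

for r, prob in probabilities.items():
    print(f"{r} of the {n_more_apt} more apt students in the same group: {prob:.3f}")

# Probability of five or more of the more apt students landing in the same group.
total = sum(probabilities.values())
print(f"five or more in the same group: {total:.2f}")
```

The same function can be used to explore other group sizes (for example, 12 students per group) before settling on a study design.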

VIII. THE FOURTH STEP: WHAT PERFORMANCE MEASUREMENT TECHNIQUE?

Relationship with Tasks: Validity 
of Performance Measurement 

While earlier work concerned with definition of the tasks will have placed reasonable bounds on the 
transfer study, more detailed definitions of the student's tasks have to overlap work for development of 
the performance measurement technique. While the absolute nature of the performance measurement 
technique will depend on many aspects of the particular study, it is essential that tasks and measurement
be related logically. To the extent that such a relationship is established well, validity of performance
measurement will just about take care of itself.

The Sequence: Tasks/Criteria/Limits 
Allowable/Performance Measurement 

Although means for expediting the process are likely to differ from study to study, it seems
reasonable that consideration of the sequence to be illustrated may be central to establishment of a
necessary bridge between task definition and measurement. The sequence implies the following steps:

Define the Tasks Operationally: Exactly what will the student be required to do? Depending on the
complexity of the tasks, this may be defined at various levels of detail.

Set Criteria for Performing the Tasks: How are these criteria established by physical facts of the
tasks?

Specify Deviations from Those Criteria That Can Be Tolerated: In the same kinds of terms used to
define the tasks and performance criteria, what performance limits likely will permit successful
completion of the tasks?

Structure the Performance Measurement Units and Means for Taking Data: It is at this point that
the process is likely to become iterative, the question being whether desired types of data can be taken.

Illustration: Number of Trials (and/or Errors) 
to Performance Criterion 

The sequence can be illustrated with an example from an early study concerned with transfer of
learning in the context of making approaches and landings (Payne, Dougherty, Hasler, Skeen, Brown, and
Williams, 1954). Experimental students were pretrained for the task in a simulator, where they were
required to achieve a performance criterion prior to moving to the aircraft. Their performances and those
of their control student counterparts were measured during retraining in the aircraft. The study used a
measurement of the number of trials and errors accumulated before arriving at a total task performance
criterion. The illustration to follow is concerned only with performance in the aircraft (the sequence used
with experimental students in the simulator having been nearly identical but somewhat attenuated
because of limitations of that device).

Definition of the Task (abbreviated here): The instructor positioned the aircraft for a 90-degree side
approach from the left, giving control to the student. At this point, the student was required to make
necessary power reductions, the turn onto the final approach, the approach proper, the flare, and the
touchdown for a wheel landing. The task ended after the aircraft executed a short post-touchdown roll.

(For convenience, performance criteria, performance limits, and the performance measurement 
process are illustrated in tabular form.) 






| Performance Criteria | Performance Limits | Performance Measurement |
|---|---|---|
| From starting position to position on windline (within imaginary extensions of runway edges): | | |
| 1. Airspeed: 90 mph. | +10 to -5 mph. | Observed on instructor's airspeed indicator. |
| 2. Turn onto approach was not overshot. | Did not pass windline; turn completed within runway width (150 ft). | Observed by instructor from rear seat. |
| 3. Turn onto approach was not undershot. | Did not fail to reach windline; turn completed within runway width (150 ft). | Observed by instructor from rear seat. |
| 4. Aircraft was on windline prior to passing airport boundary fence. | Was within windline (150 ft). | Observed by instructor from rear seat. |
| 5. Student was assisted in no way. | None. | Instructor did not assist student verbally or by control action. |
| From position on windline to position over end of runway: | | |
| 6. Airspeed: 90 mph. | +10 to -5 mph. | Observed on instructor's airspeed indicator. |
| 7. No S-turns outside of windline. | Did not depart from windline (150 ft). | Observed by instructor from rear seat. |
| 8. Manifold pressure at 15 in. Hg. | ±5 in. Hg. | Observed on instructor's manifold pressure indicator. |
| 9. Glidepath aimed at a definite point within first third of runway. | A point between near end of runway and the one-third marker. | Observed by instructor from rear seat. |
| 10. Aircraft crossed near end of runway at 100 ft altitude. | ±50 ft. | Observed on instructor's altimeter. |
| 11. Student was assisted in no way. | None. | Instructor did not assist student verbally or by control action. |
| Point of touchdown: | | |
| 12. Touchdown executed within first third of runway. | A point between near end of runway and the one-third marker. | Observed by instructor from rear seat. |
| 13. Touchdown executed in center of runway. | At least one wheel within two white center lines. | Observed by instructor from rear seat. |
| 14. Student was assisted in no way. | Aircraft touched down on main wheels; student allowed aircraft to roll (or to skip lightly) to demonstrate that no serious bounce would take place. | Instructor did not assist student verbally or by control action. |



Several points are of interest:

a. The 14 sets of criteria, limits, and measurements were developed during a great deal of
preparatory work. The task was carried out using the AT-6 aircraft, with power settings and airspeed
being standard for the type of approach and landing used (then called a "transport landing"). (Simulator
pretraining work with experimental students used the 1-CA-2/AT-6 Link Trainer, modified to provide a
dynamic projection of the runway image.) Task limits were established by the instructors while observing
from both the aircraft and the ground. The glidepath angle was measured using a surveyor's instrument
(a theodolite), enabling establishment of points for beginning the maneuver and flying the approach with
90 (indicated) mph, [...] rpm, 30 in. Hg of manifold pressure, gear and full flaps down. Thus the subtasks
and their performance limits were judged to be entirely valid descriptors of successful execution of the
maneuver.

b. The instructor said nothing during each of the student's trials. As the student performed a trial,
the instructor made necessary observations and entries for the 14 performance units. Only after
repositioning the aircraft for starting a subsequent trial did the instructor make comments and corrective
remarks. Had instruction taken place during a trial, the measurements could have reflected those remarks
as well as the student's performance, the two being confounded absolutely.

c. A successful approach and landing were defined as the student's having met all 14 subcriteria;
missing even a single item was defined as an unsuccessful trial. In this study, the instructor scored
performance as it occurred, the process having been possible because of the tandem, two-place aircraft
used. Observations were recorded using a standard, knee-clipboard form.

d. The student met total task criterion performance at the point of having made three consecutive
successful approaches and landings. Preliminary work had indicated that such performance was highly
unlikely on the basis of chance alone. (Tests had shown that once this "three-in-a-row" criterion was met,
the student tended to execute a long series of successful maneuvers before a subsequent "out-of-limits"
observation occurred.)

e. Some of the subcriteria for successful performance in terms of individual units were of relatively
subjective nature and, sometimes, were difficult to score. (Windline examples are a case in point.) It was
found necessary to impose a rule that the instructor give a "within-limits" score for any measurement unit
about which there was any doubt. Preliminary work indicated that, using this rule, observer-observer
reliability of scoring approached unity. While the rule had the effect of widening acceptable performance
limits somewhat, the total measurement technique proved to be highly sensitive to differences in goodness
of performance.

f. This technique provided the study with several different types of estimates of transfer of
learning. While the principal interest involved percent transfer in terms of number of trials to
performance criterion and number of errors during trials to performance criterion, it was also possible to
estimate first-trial transfer in terms of errors and to estimate transfer in terms of errors made during the
first five trials.

g. In 1953, when the study was conducted, primary interest was with the percent transfer of
learning, not the transfer effectiveness ratio. Unfortunately, records that would have enabled calculation
of a transfer effectiveness ratio (long after the fact) were lost. Those records showed number of trials to
performance criterion for the experimental students during simulator pretraining.
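
The scoring rules described in points c and d lend themselves to a short sketch. The code below is a hedged illustration, not a reconstruction of the 1954 procedure: the trial records are invented, the function names are hypothetical, and the percent-transfer expression is the conventional one, (control minus experimental) divided by control, assumed here rather than quoted from the study. It also assumes that the trials making up the criterion run are counted in the trials-to-criterion figure.

```python
from typing import Optional, Sequence


def trial_successful(within_limits: Sequence[bool]) -> bool:
    """A trial succeeds only if every subcriterion was scored within limits."""
    return all(within_limits)


def trials_to_criterion(outcomes: Sequence[bool], run_length: int = 3) -> Optional[int]:
    """Return the 1-based trial number on which the student completes
    `run_length` consecutive successful trials, or None if never reached."""
    streak = 0
    for trial_number, succeeded in enumerate(outcomes, start=1):
        streak = streak + 1 if succeeded else 0
        if streak == run_length:
            return trial_number
    return None


def percent_transfer(control: float, experimental: float) -> float:
    """Conventional percent transfer: savings relative to the control group."""
    return 100.0 * (control - experimental) / control


# Hypothetical trial-by-trial outcomes for one control and one experimental student.
control_outcomes = [False, False, True, False, True, True, True]
experimental_outcomes = [False, True, True, True]

control_trials = trials_to_criterion(control_outcomes)
experimental_trials = trials_to_criterion(experimental_outcomes)
print(control_trials, experimental_trials)
print(f"percent transfer: {percent_transfer(control_trials, experimental_trials):.1f}")
```

In a real study the inputs to `percent_transfer` would be group means rather than single students, and a student who never reaches criterion (a `None` result) would need explicit handling in the analysis.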

Illustration: Performance Grading

The process of establishing tasks, criteria, limits, and performance measurement can be illustrated
with an example from a more recent study concerned with transfer of learning in the context of air combat
maneuvering (Northrop Corporation, 1976). The study was concerned with the percent transfer of
learning for experimental students who had been pretrained in a special simulator, using an instructor
grading system because the portion of the training syllabus that could be involved was too short to permit
measurement of number of trials to a criterion. The sequence to be described is concerned only with
performance in the aircraft.

Definition of Tasks, Criteria, and Performance Limits: Tasks consisted of eight basic maneuvers
used in an air combat maneuvering training syllabus. Instructors provided descriptions of these
maneuvers, each of which was divided into logical segments, together with criteria and criterion limits for
successful performance. Measurement units were based on these descriptions, together with the types of
high-frequency student errors that had been observed during operational training.

Performance Measurement (Grading): It was not feasible to grade performance while airborne
because of very short durations of critical maneuver segments, together with the high g forces involved.
Therefore grading was done on the ground immediately following the training flight. Instructors used
standardized grade sheets, showing the several measurement units, and indicated the type of maneuvers
used in each engagement. The two instructors who had worked with the student were required to grade
each measurement unit on a consensus basis.

The Grading Scale: Instructors graded each measurement unit using letter grades of the following
scale:

| Grades | Definitions | Numerical Equivalents (enabling analyses) |
|---|---|---|
| A+ | | 12 |
| A | Superior | 11 |
| A- | | 10 |
| B+ | | 9 |
| B | Above Average | 8 |
| B- | | 7 |
| C+ | | 6 |
| C | Average | 5 |
| C- | | 4 |
| D+ | | 3 |
| D | Below Average | 2 |
| D- | | 1 |
| F | Failing | 0 |



Instructors used the scale in two stages (not being concerned with numerical equivalents). First they
rated each unit across the five-point scale: A through F. Second, when they had entered one of the top
four categories, they were asked to qualify the grade as necessary to express their judgment with greater
precision (as: B+, B, or B-). This resulted in a highly sensitive 12-point scale that permitted fine
differences in performances to be discriminated.
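
The conversion from letter grades to numerical equivalents is simple to mechanize for analysis. The sketch below builds the mapping from the scale described (A+ = 12 down through D- = 1, with F = 0); the function names and the sample grades are illustrative assumptions, not data from the study.

```python
from statistics import mean


def build_grade_scale() -> dict[str, int]:
    """Map letter grades to numerical equivalents: A+ = 12 ... D- = 1, F = 0."""
    scale = {"F": 0}  # F carries no plus/minus qualifier
    value = 12
    for letter in ("A", "B", "C", "D"):
        for qualifier in ("+", "", "-"):
            scale[letter + qualifier] = value
            value -= 1
    return scale


def mean_numeric_grade(grades: list[str], scale: dict[str, int]) -> float:
    """Average the numerical equivalents of a list of letter grades."""
    return mean(scale[g] for g in grades)


scale = build_grade_scale()

# Hypothetical consensus grades for three measurement units of one maneuver.
sample_grades = ["B+", "B", "C+"]
print(mean_numeric_grade(sample_grades, scale))
```

Keeping the numerical equivalents out of the instructors' hands, as the study did, and applying them only at the analysis stage keeps the grading judgment separate from the arithmetic.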

This type of grading scale has been used in a number of different study contexts, in each case
proving highly successful for quantifying expert professional judgment. In this particular study, it was
necessary to observe two precautions. First, since there was a marked difference between capabilities of
the student pilots and their highly skilled instructors, those instructors regarded the entire range of the
grading scale as representing types of student performance only. Second, the five basic grading categories
were defined (e.g., the "superior" category represented performances of the top 10 percent of students of
the operational training program). Use of such types of definitions seems advisable in an attempt to
standardize interpretations of scale categories.

Some Questions 

During the process of defining tasks, performance criteria, allowable limits, and measurement units, 
it might prove useful to ask questions such as the following: 

a. Can the tasks be categorized according to segments that have logical start and end points? Do the 
tasks involve equipment limitations (stall speed, g limits)? 

b. At each readily defined, critical mission segment, what is the crux of successful performance? Is
the judgmental factor or the motor factor the more critical, or are they of equal importance?

c. How is time critical and at what points? Since it is neither possible nor desirable to attempt to
measure every aspect of performance, is it possible to associate performance measurement units with
time-critical periods or segments of the maneuver or mission? These periods are, after all, when serious
errors are most likely to take place.

d. Is it possible to measure problem detection latency? Can it be inferred on the basis of subsequent
action?

e. Is it feasible to match "time available" with performance time? Time available for an action, a
maneuver, or a mission segment would have to be derived from operational definitions. This kind of
performance measurement would appear to be particularly pertinent in terms of combat mission
segments. Did the student do the correct thing but take too long to do it?

f. At each readily defined, critical mission segment, is it possible to list the types of errors that 
students frequently have tended to make in the past? 

g. Is it possible to delineate a reasonably small number of aircraft actions or positions involved in
carrying out tasks, these being placed in descending order of desirability? Particularly in cases involving
single-place aircraft, this may prove to be an essential measurement category, the instructor having to
make a judgment from a position in another aircraft.

h. Is it possible to estimate the student's level of concentration? This might involve the use of
secondary tasks in an attempt to estimate the amount of effort required by the student. Aspects of tasks
permitting, the student approaching a high level of learning should have more time and energy remaining
for executing additional tasks.

Relatively Molar Performance Measurements 

It is suggested that the researcher should not necessarily avoid measurement of performance in
relatively molar terms as long as the measurement units are anchored to clear definitions of important
tasks, clear definitions of what the student will be required to do, and clear definitions of consequences of
serious deviations from the limits provided. Transfer studies should look for large performance
differences that could be of practical significance, not small differences no matter the level of statistical
significance. Measurements should not deal with molecular trivia simply because they are easy to define
and measure.

Recording Techniques 

Past work has made use of both hard copy (forms with pencil entries) and tape recordings. In the
main, however, hard copy has seemed to be the more useful. For one thing, the printed scoring or grading
form provides a checklist of items to be covered. For another, transcribing or listening to tape contents is
severely time consuming. And, depending on the type of recorder used, maneuver g forces can slow down
the mechanisms, rendering subsequent playback less than truly clear. Whether technological advances
and budgets will permit use of forms of truly useful automatic airborne recording techniques remains to
be seen.

Automated Performance Measurement Systems 

There would appear to be an unfortunate belief in some quarters that an automated performance
measurement system, as such, implies associated validity of data. That is, of course, just not so. Validity of
measurement data depends on the anchor to reality and has nothing to do with how the measurements are
implemented. It might be useful, however, to consider three services that an automated measurement
system might provide, those services possibly solving some problems facing the human data taker.

Reliability — An automated system, being subject to less variability in operation than is the human 
observer, should provide measurement data of greater reliability in the sense of measuring the same type 
of event from trial to trial and from student to student. Designing manual measurement techniques having 
high observer-observer reliability can be difficult.

Span of Surveillance: An automated measurement system could take into account all items it is
designed to cover, consistently, not being subject either to distraction or to a limited field of view as is the
human data taker. Human data takers obtain most of their information visually, with the requirement to
timeshare; they simply cannot look in several directions at once. And even though human observers may
be required to attend only to a very narrow or highly specified aspect of a visual situation, there is the
problem of vigilance error; that is, an observer may look at the correct location but too soon or too late.

Access to Information: Many difficulties in measuring performance are a function of not being able
to position the human observer to permit a view of the desired events. Consider the single-place fighter
aircraft or even a two-place aircraft in which an observer in a second seat cannot see either the student's
control actions or the outside world from the student's point of vantage. For an observer located in a
second aircraft, the principal source of information is the dynamic physical positioning of the student's
aircraft. That is fine from the standpoint that physical positioning is the end product of the student's
decision making and action processes, but it tells the observer little about why errors took place. Those
reasons must be inferred. The observer has to make do with the things that can be seen.

To the extent that an automated measurement system could be provided with necessary sensing
devices and be mechanized economically and in necessary lightweight and compact form, it might be
located within the student's aircraft, solving many of these kinds of problems.


Performance Measurement in the Aircraft and in the Simulator 

Most of the discussion thus far has been concerned with measuring student performance in the
aircraft. Airborne performance measurements are essential to the study of transfer of learning and
provide much of the "payoff" information. But performance measurement during simulator pretraining
is important too. During the illustration of models of transfer studies, it was noted that simulator
pretraining should continue to a performance criterion. If that is not done, the notion that learning has
taken place can be something of an act of faith. Study results will have more meaning if evidence is
provided indicating that learning did take place during simulator pretraining. This concept holds for
either model for the transfer study, but it may be even more critical for the model concerned with a
transfer effectiveness ratio.



IX. THE FIFTH STEP: THE INSTRUCTORS

It has been noted earlier that the role of the instructor pilot is critical to the conduct of the study of
transfer of learning. Too frequently in the past this factor has not been recognized fully, insufficient
emphasis having been placed on the various important contributions of the instructor. This may have
been the case because of undue attention paid to the nature of the simulator, this having tended to
overshadow more critical issues. Most researchers tend to be enchanted with elegant equipment, this
possibly leading to two dangerous semantic traps.

First, it is customary to speak as though simulators "train"; however, they do not, they never have,
and they never will. It is the instructor who does the training. The goodness of design of the simulator may
be important in providing the instructor with the necessary training environment, but it seems unlikely
that engineering and cost restrictions will allow a type of simulator to be designed that will provide a
"work sample" so complete that maximum transfer can occur without superior instruction.

Second, a nearly universal expression is that someone "received training." That unfortunate
phrase suggests that the training process is passive and is something like slicing cheese. (How many slices
are necessary?) But anyone who knows anything about the training environment that gets things done
knows that learning is an active process. Students cannot sit there "receiving training"; they must take an
active role, interacting with both the environment and the instructor.

Perhaps some day, there may be a training simulator environment that uses some modified form of
the concept of computer-aided, programmed instruction, no human instructor being involved except for
purposes of handling special student problems. But even in such a situation, the instruction will remain the
key element. Programmed instruction provided with such an advanced simulator should be based on
skills, knowledge, and techniques of a large number of instructor pilots, the basic situation being similar
to contemporary effective training but taking advantage of such combined information.

The Instructor as the Researcher 

The instructor cadre must participate in the design of the study from the outset, providing
information that helps anchor the study to reality, particularly with respect to the nature of the task and
the performance measurement technique. But over and beyond that work, the instructor ordinarily will
conduct the study in addition to the role of guiding the student's learning. During critical airborne work,
the instructor is also the researcher and data taker as well as the safety pilot. What is more, the instructor is
the most logical individual to handle simulator pretraining of experimental students.

Training for Transfer 

The technique of training for transfer has been shown to be critical when features of the simulation
environment may be markedly different from those to be encountered in the air. The simulation
environment, by definition, is at variance with the prototypical environment. Because of physical and
engineering limitations, sometimes aspects of the synthetic environment may be diametrically opposed to
those of the operational situation. In such cases, there can exist a "built-in" effect that likely leads to
negative transfer, simulator pretraining possibly providing an interfering effect upon subsequent
performance in the air. Further, in some cases it may not be possible to carry out particular sub-tasks in
the simulator, even though those sub-tasks are very important in the air.

The process of training for transfer involves identifying and being certain that the student
understands the limitations of the simulator as compared to an aircraft, and the instructor is uniquely
qualified for this responsibility. It may be necessary to perform a particular function one way in the
simulator and another way in the aircraft, as is appropriate to each. The student must know about these
differences and why they exist. It has been found useful to explain such differences to the student at
frequent intervals: at least prior to and during simulator work and prior to and during airborne work.
The more severe the differences, the more frequently they should be pointed out.


To illustrate the concept, early transfer studies used a simulator requiring considerable rudder pedal
travel with minimal stick movement to perform a coordinated turn (1-CA-2/AT-6 Link Trainer), while
the counterpart aircraft (AT-6) required exactly the reverse: little rudder pedal travel with considerable
stick movement (Payne et al., 1954; Williams & Flexman, 1949). While this is a dramatic example of
built-in potential for negative transfer, work in those studies showed that, if the problem is made quite
clear to the student prior to and during simulator work and prior to and during airborne work, such
training for transfer completely offsets the potential, the student having little difficulty in either the
simulator or the aircraft.

The recent study cited, concerned with transfer of learning in the context of air combat
maneuvering, involved no fewer than 20 aspects of the simulation environment that differed importantly
from their airborne counterpart (Northrop, 1976). The instructor pilots identified those aspects and had
them printed on a sheet in descending order of importance, distributing that sheet to all experimental
students. In addition, they emphasized the problems during briefing and debriefing sessions for work in
both the simulator and the aircraft (F-4J). The following are some of those aspects:

a. Target detail definition decreases greatly beyond 1 mile, but the target remains as a "light
source" out to infinity.

b. Simulator provides more instantaneous g than does the F-4J, at all airspeeds.

c. Simulator departs at 30 to 33 units and usually cannot be recovered.

d. Pulling simulator nose up at high airspeeds is more difficult than in the F-4J.

e. It is very easy to exceed 6g in the simulator.

f. The simulator has large amounts of roll divergence.

g. Buffet effects are less intense in the simulator than in the F-4J.

h. Simulator rudder is too sensitive at slow speeds.

i. Flying ACM in the simulator provides a twilight effect; it is similar to flying at dusk.

Subsequent conduct of the study indicated that the experimental students were well aware of the 
differences and that they had little difficulty making appropriate adjustments and responses during work 
in the aircraft. Since the set of differences could have provided a marked built-in potential for negative
transfer, it is likely that the ultimate information obtained from the study would have been much less
important except for this process of training for transfer. 

Sensitizing the Student to Necessary and Sufficient Cues: The process of training for transfer can be
of value when cues of different types are available in the simulator and in the air. Although the problem
may be less severe with today's higher quality of simulation environments, there may be occasions in
which cues found most effective in the operational environment cannot be produced in the simulator.
Under such conditions, the instructor would do well to point out differences, noting both those cues that
are likely most useful in the air and those that can be used for the same purpose in the simulator. This
procedure need not be paradoxical because, frequently, different pilots make use of different sets of cues
as aids during performance of the same maneuver, these perhaps depending on their individual
preferences. Even the same pilot may use different sets of cues at different times, such as while flying
types of aircraft that permit peculiar angles and extents of view. The pilot makes do with alternatives
that serve the same purpose.

Use of Relatively Simple Aids

To aid the instructor during the briefing and debriefing sessions, usually it is a good idea to provide 
models, photographs, chalkboards, or other items of relatively simple equipment that can be used to 
illustrate points clearly. Air combat instructor pilots have made heavy use of a pair of simple wooden 
triangular blocks mounted on the ends of dowel sticks. Use of such rudimentary equipment might sound 
inelegant, but often it appears to serve the purpose extremely well.

Rigorous Adherence to the Study Design 

The transfer study, as any other formal study, must be conducted under highly controlled conditions
so that resulting data are not confounded with extraneous events. The goal should be that the transfer
study reflect only the results of pretraining in the simulator. To provide for such control, students must
work with a common syllabus of tasks carried out in a prescribed sequence, in the absence of free-floating
variables such as giving a particular student a special exercise (even though, in an operational situation,
that might be the logical thing to do). Such deviation from a prescribed sequence of events could render
the resulting data uninterpretable. If the instructors are co-designers of the study, they will be unlikely to
deviate from standardized procedures, even inadvertently.

No Instruction During Measurement of
Student Performance

The study design should provide that no instruction take place while the student is performing and
performance data are being taken. If an instructor makes a comment (even casually) during a
measurement trial, the resulting data are likely to reflect that input in addition to (and confounded with)
the student's ability level.

Balance of Instructors Between or Among Control 
and Experimental Groups 

One of the surest ways to arrive at biased transfer estimates is to allow imbalance of instructional
techniques and styles among groups. The problem can be avoided by providing that each instructor work
with equal numbers of students in each group of the study. If this is done, the variable of individual
differences among instructors will be balanced, and as long as the instructors follow agreed-upon
practices, they are free to explain issues and train according to the personal techniques that they
have developed and found effective for their own particular style.
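
The balancing rule can be combined with random assignment of students in a few lines. The following is a minimal sketch under stated assumptions: student and instructor identifiers are invented, the function name is hypothetical, and the number of students is assumed to divide evenly among the instructor-by-group cells.

```python
import random
from collections import Counter


def assign_balanced(students, instructors, groups=("control", "experimental"), seed=None):
    """Randomly assign students so that every instructor works with the same
    number of students in every group. Returns {student: (instructor, group)}."""
    n_cells = len(instructors) * len(groups)
    if len(students) % n_cells != 0:
        raise ValueError("students must divide evenly among instructor-group cells")
    rng = random.Random(seed)
    shuffled = list(students)
    rng.shuffle(shuffled)  # random order, so aptitude is not tied to any cell
    per_cell = len(shuffled) // n_cells
    remaining = iter(shuffled)
    assignment = {}
    for instructor in instructors:
        for group in groups:
            for _ in range(per_cell):
                assignment[next(remaining)] = (instructor, group)
    return assignment


# Hypothetical roster: 16 students and 4 instructors, 2 students per cell.
students = [f"S{i:02d}" for i in range(1, 17)]
instructors = ["IP-A", "IP-B", "IP-C", "IP-D"]
assignment = assign_balanced(students, instructors, seed=1)

cell_counts = Counter(assignment.values())
for cell, count in sorted(cell_counts.items()):
    print(cell, count)
```

Fixing the seed makes the assignment reproducible for the study record; omitting it gives a fresh random draw each time.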

The Same Instructor; Simulator and Aircraft 

It is important that the same instructor train the experimental student in both the simulator and the
aircraft. This practice is likely to facilitate the effort to arrive at maximum transfer effects. The instructor
who has done the simulator pretraining will have the best possible understanding of the individual
student's strong and weak points, being able to estimate what that student did and did not learn during
pretraining, and being able to use that knowledge to the best advantage during retraining in the aircraft.
Immediately prior to an exercise in the aircraft, the instructor can review important issues with the
student, refreshing the student's memory of particular performances in the simulator and mentioning
significant differences that exist between the simulated and airborne environments.


X. THE SIXTH STEP: PLANNING FOR SUFFICIENT STUDY TIME

It is very easy to overlook the issue of planning for a study syllabus of sufficient duration that all
students will have a reasonable amount of time in which to arrive at an end performance criterion
(experimental students in the simulator and all students in the aircraft). Failure to provide sufficient time
can result in the data of the study being attenuated, with not all students' performances figuring into
analyses. In the worst case, no students would arrive at performance criterion, the study being a total
failure or else transfer estimates being dependent on a grading process. The point is, of course, that
individual students simply are likely to learn at different rates, requiring different amounts of time to
arrive at performance criterion.

The cited study concerned with approaches and landings (Payne et al., 1954) ran into a problem as
students were in the final phase of making landings in the aircraft. Students, drawn from an Air Force
Reserve Officers Training Corps (ROTC) program, were nearing landing performance criterion when
their semester ended, and they had to leave. Only 8 of the 12 students met the landing criterion.
Fortunately, four of these were in the control group and four were in the experimental group, permitting a
reasonable and balanced estimate of transfer.

The cited study concerned with air combat maneuvering (Northrop, 1976) had to be conducted using
an operational training syllabus of such short duration that the use of a trials-to-criterion measure was not
possible. In that case, the problem was recognized before the fact, with performance measurement
consisting of instructors' grades in lieu of trials-to-criterion. While that permitted reasonable estimates of
percent transfer of learning, it was not possible to arrive at estimates of a transfer effectiveness ratio. A
form of transfer effectiveness estimate might have been feasible had the syllabus been of sufficient length
that an instructor could have shortened (or omitted altogether) portions of a student's mission segments
when, in the instructor's judgment, goodness of performance warranted such action. Even that, however,
was not possible. Instructor pilots had pointed out, before the fact, that the syllabus was too short to permit
a sufficiently high level of learning to justify any omission of syllabus items. And since that syllabus was
set by operational training rules, it could not be adjusted.
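
For orientation, the two measures contrasted above can be written out as a short Python sketch. The formulas are the conventional ones (savings in aircraft trials relative to the control group, and savings per unit of simulator practice); they are assumed here rather than quoted from this section, and the function names and numbers are purely illustrative.

```python
def percent_transfer(control_trials, experimental_trials):
    """Percent of aircraft trials saved by the pretrained (experimental) group."""
    return 100.0 * (control_trials - experimental_trials) / control_trials


def transfer_effectiveness_ratio(control_trials, experimental_trials, simulator_trials):
    """Aircraft trials saved per simulator trial invested in pretraining."""
    return (control_trials - experimental_trials) / simulator_trials


# Hypothetical group means (trials to criterion), not data from any cited study.
control_mean = 20          # control group, aircraft only
experimental_mean = 12     # experimental group, aircraft after pretraining
simulator_mean = 10        # experimental group, simulator pretraining

print(percent_transfer(control_mean, experimental_mean))
print(transfer_effectiveness_ratio(control_mean, experimental_mean, simulator_mean))
```

The sketch also shows why the ratio could not be estimated in the study just described: it needs a count of trials (or time) to criterion in both the simulator and the aircraft, and instructors' grades supply neither.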

Estimates of Necessary Performance Time 

In designing the study, the goal would be to provide sufficient time for the least apt student (in either
or any group) to complete the work and to arrive at an end performance criterion. Preliminary testing
would appear to be the best means of estimating necessary time because tasks, their degrees of relative
difficulty, associated performance criteria, and types of students can be quite different from study to
study. Even use of preliminary testing might not provide a complete answer, considering that only small
numbers of students are likely to be involved. But since the consequences of too little available time can
be serious, resulting estimates might have to be padded. It is far better to allow too much time than too
little.
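
A minimal sketch of the padding idea follows, assuming that preliminary testing has produced a trials-to-criterion count for each student. The function name and the padding factor of 1.5 are hypothetical choices for illustration; the guide does not prescribe a particular margin.

```python
import math


def padded_time_allowance(preliminary_trials_to_criterion, padding_factor=1.5):
    """Plan around the slowest preliminary student, then add a safety margin.

    padding_factor is an assumed value; it should be chosen by the research
    team in light of how few preliminary students were available.
    """
    slowest = max(preliminary_trials_to_criterion)
    return math.ceil(slowest * padding_factor)


# Hypothetical preliminary results for three students.
print(padded_time_allowance([14, 18, 22]))
```

Basing the allowance on the slowest preliminary student rather than on the mean follows the stated goal of giving the least apt student enough time to reach criterion.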

XI. THE SEVENTH STEP: AVOIDANCE OF DILUTANT FACTORS

"Dilutant factors" are defined here as practices that can prevent demonstration of maximum possible 
transfer effects of a study. The concern here is with two dilutant factors that are not necessarily mutually
exclusive.

Avoid Time Delays Between Simulator Pretraining and
Retraining in Aircraft 

While the severity of the problem of time delays between the simulator pretraining and the
retraining in the aircraft may be dependent on the nature of the specific study, the issue would appear to
be highly critical for tasks that are "volatile" in nature, that is, tasks involving skills highly subject to
decay in the absence of practice. This may be illustrated in terms of the study concerned with air combat
maneuvering (Northrop, 1976). In that study, unavoidable scheduling restrictions required that
experimental students be pretrained in the simulator on a massed basis during a 5-day period, moving to
work in the aircraft only after completion of that block of simulator work. For a number of reasons,
including the facts that the simulator was located more than 100 miles from the airbase, the press of work
of the operational training schedule at that airbase, student loadings, shortages of instructors, mechanical
difficulties with aircraft, weather, and interruptions of training schedules because of priorities, delays
between simulator pretraining and retraining in aircraft were as long as [illegible] weeks. The principal
priority causing interruption of the schedule involved availability of aircraft carriers for qualification
training. Carriers became available only infrequently and had to be used immediately. Observation of
goodness of performance in the simulator and resulting transfer effect estimates suggested rather strongly
that there was a clear and strong dilutant effect.

Instructor pilots who conducted the study noted that skills of air combat maneuvering are quite
volatile in the sense that periods of inactivity of as much as 10 days resulted in noticeable decrements in
their own performances. It takes little imagination to estimate the performance decrement for student
pilots who had completed the simulated equivalent of only six flights in this context.

Pretrain Using the Simulator in Meaningful Blocks of Tasks 

Precisely what a "meaningful block of tasks" might be would depend on the context of the particular
transfer study. But again, the issue may be illustrated best in terms of the air combat maneuvering study
(Northrop, 1976). Experimental students were pretrained in the simulator for the first 6 flights of a
17-flight air combat syllabus used in the operational training environment, with only those first 6 flights
figuring into the transfer study. The flights were designed to acquaint the student with tasks of air combat
maneuvering in a sequential order, beginning with basics and progressing to engagement exercises of
increasingly difficult nature. The initial flight was for familiarization and involved only a single aircraft.
Subsequent flights introduced eight basic maneuvers of the total syllabus, with the difficulty of combat
engagements being increased. The instructor, flying the "adversary aircraft" during two-aircraft
exercises, began by presenting a relatively easy "mark," but increased the complexity of the performance
to the point that, by the sixth flight, the student was "fighting" a relatively skilled "opponent."

Once the simulated equivalents of those six flights had been completed, the experimental students
moved to the airbase and began the normal training syllabus as used in the operational squadron. It can
only be surmised that pretraining in this blocked manner may have been less than optimally effective in
terms of transfer of learning. It seems highly likely that had the experimental students been pretrained for
each individual flight and subsequently retrained in the aircraft for that flight, the resulting transfer
estimates might have been considerably greater.

It can be reported only on the basis of personal observation that resulting transfer estimates seemed
far lower than might have been expected without the compounding effects of these two dilutant factors:
(a) delay between simulator pretraining and aircraft retraining and (b) massed training of the sort
described. In any event, the lesson seems clear. If a transfer study makes use of clearly functional blocks
of simulator pretraining, moving experimental students to the aircraft as soon as possible, the resulting
transfer effects should be augmented.

Colocation of the Simulator at the Site of Airborne Training 

Probably the best way to prevent delays between simulator pretraining and aircraft retraining would
be to locate the simulator at the airbase to be used in the study. Even if this is possible, however, proper
scheduling would still be critical. But in the event that the simulator must be located elsewhere, every
attempt should be made to transport experimental students to the airbase after they have completed
logical blocks of simulator pretraining, getting them into the air at the earliest feasible times. The
problem and the solution are easy to state. Expediting the solution must depend on aspects of the
particular study.

XII. THE EIGHTH STEP: IMPORTANCE OF SCHEDULING IN ADVANCE

The issue cannot be emphasized too heavily. During early phases of planning, the research team
should begin to assess potential scheduling problems and should consider these on an iterative basis as
final plans take shape. Even prior to testing the study method, a detailed schedule should be prepared,
taking into account times for involvement of students, instructors, simulators, and aircraft. This must not
be left to chance.

Cooperation of the unit commander and the unit operations officer will be critical to development
and enforcement of the schedule, and here as before, the instructors working in the study should be able
to help achieve such cooperation.

Means must be found for preventing visitors from interfering with scheduled study work. Experience
has shown clearly that this can be a serious problem. Perhaps it can be solved best through orders issued
by pertinent unit commanders. The problem tends to be most severe during simulator pretraining.
Simulators, particularly those of elegant nature, tend to attract visitors frequently. If the environment
permits, it may be possible to provide for a spectator vantage point that does not interfere with training
work. Above all, neither the student nor the instructor should be aware of the presence of visitors,
especially when those visitors are of high rank.

XIII. THE NINTH STEP: PLAN FOR RUNNING THE STUDY IN THE MIDST OF A BUSY
OPERATIONAL TRAINING ENVIRONMENT 

If the study is to be conducted in the midst of a busy operational training environment, the
cooperation and support of the unit commander and of the unit operations officer are required on the one
hand, and planning for minimum interference with the operational training program is required on the
other hand. While it would be highly desirable to be able to run these kinds of studies using a dedicated
facility, it seems more likely that they will have to make use of operational facilities.

Cooperation and Support of the Unit Commander and 
the Unit Operations Officer 

It is easy for the researcher to lose sight of the fact that the operational people have their own
problems, and at best, cooperation with a study effort could be simply an additional annoyance. It may be
that major objections can be avoided by making the unit commander and the operations officer parties to
the purpose of and planning for the study from the outset. While it might be tempting for the researcher to
rely on orders from higher authority directing the unit commander to support the research work,
it takes little imagination to see that this can be a serious mistake. The research team would be wise to
work with the operational people from the very beginning, persuading them of the importance of the
study and getting their professional inputs for planning the effort. The instructors can play essential roles
here, having close professional ties with the operational unit people. In many cases, preparatory work here
can make or break the study.

Planning for Minimum Interference with the Operational Schedule 

The research team, working with the operational people, should develop a clear set of plans for 
preventing all but absolutely necessary interference with the operational work. The interference may 
consist principally of time required for simulator pretraining of experimental students, but the nature of 
the study may impose still other requirements, to include modified routines during airborne work, use of
research instructors, balancing instructors' work with experimental and control students, and special 
student briefings and debriefings. But if proper rapport, cooperation, and support have been established 
at the outset, it should be possible to solve various problems to everyone's satisfaction. There is no way to 
overemphasize the importance of these issues. The process of solving potential problems involves a lot of 
planning and work but it is critical for the success of the study. Appropriate members of the research team 
should remain in constant touch with the operational people for the duration of the study. 

XIV. THE TENTH STEP: TESTING THE STUDY METHOD BEFORE
TAKING FINAL DATA 

In the past, the process of testing the study method before taking the final data has been called
"pretesting." That label tends to be slightly misleading, however, being confused with the process of early
and preliminary testing of issues that are to be the basis for the transfer study. In any event, the process
should consist of what amounts to a small dress rehearsal conducted before the actual study begins, the
effort being an attempt to discover method problems that had not been predicted earlier.

As in other types of research, testing the study method is essential. It is indeed rare that all problems 
are predicted, regardless of the amount of care that has been devoted to the plan. Such method testing 
should be conducted sufficiently early to provide the research team with adequate time to make last 
minute fixes or corrections. Frequently the method testing process need use only a very few students who 
go through the entire course of the planned study. Possibly greater emphasis should be placed on routines 
involving experimental students, although routines for control students must not be ignored.

A problem may involve availability of students in sufficient numbers to conduct both the method
testing work and the actual study. Depending on the number and severity of method problems discovered
(with changes being required for routines of the actual study), it is generally a good idea to provide that
performance data from students used in method testing are not included with data from students in the
actual study. Thus, the problem is one of not using too many of the limited number of students. It may be
possible to use students who are of a slightly different nature than those to be used in the actual study,
although truly severe differences could pose a real problem. As is the case with many other issues for
these transfer studies, the research team will have to exercise considerable imagination and judgment
when and if the student scarcity problem is encountered.

XV. THE ELEVENTH STEP: ANALYSIS OF RESULTS 

While the details of the data analyses will depend on the nature of the specific transfer study, a few
observations can be made that should apply to many types of studies. As has been suggested, transfer
studies should be concerned with reasonably substantial performance differences between or among
groups of experimental and control students, that is, differences that could have practical meaning.
Interpretation of findings of a study should not be based solely on probability (p) levels associated with
inferential tests for statistical significance because those p levels simply do not tell the entire story.

It is recommended that the first step of the analysis involve placing the raw performance data in one
or more display formats that facilitate inspection. Inspection of those data should be made before, during,
and subsequent to running inferential tests of interest. Such an inspection can perform several valuable
services. First, if large performance differences exist, they will be evident by simply looking at the data.
An inspection should be directed toward looking for both large group performance mean differences and
variation of performances within the various groups. If performance variations are quite large, the use of
arithmetic means to describe group performances is not entirely satisfactory without additional
descriptors. For example, a large standard deviation for an array of values indicates that the array mean
should not be taken too seriously. The wide variation of the individual values likely has considerable
meaning that should be explored. Second, inspection of the raw data display formats during and after
running statistical inferential tests will permit an understanding of the results of those tests.
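
One simple display format of the kind recommended above is a per-group table of the sorted raw scores alongside the mean and standard deviation. The Python sketch below is offered only as an illustration; the function name `describe_groups` and the scores are hypothetical and are not data from any cited study.

```python
from statistics import mean, stdev


def describe_groups(scores_by_group):
    """Summarize each group's raw scores for visual inspection."""
    summary = {}
    for group, scores in scores_by_group.items():
        summary[group] = {
            "sorted_scores": sorted(scores),  # the raw values themselves
            "n": len(scores),
            "mean": mean(scores),
            "sd": stdev(scores),              # sample standard deviation
        }
    return summary


# Hypothetical trials-to-criterion scores in the aircraft.
example = {
    "control": [20, 22, 18, 24],
    "experimental": [12, 14, 10, 16],
}
for group, row in describe_groups(example).items():
    print(group, row)
```

Keeping the sorted raw values next to the mean makes it easy to see when a large standard deviation means that the group mean hides widely different individual performances.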

As the data are analyzed using inferential tests, the results of those tests (as in an analysis-of-
variance summary table) should be cross-compared with the raw data display formats, again with the
understanding that probability levels do not tell the entire story. In conjunction with an analysis of
variance summary, for example, it is highly useful to derive estimates of strengths of associations, such as
simple values of eta squared or estimated omega squared. (For a discussion of the estimated omega
squared statistic, see Hays, 1973, pp. 484-488, 512-513.) Perhaps the easiest way to see how these statistics
are of value involves the descriptive eta squared (estimated omega squared being its inferential
counterpart). Simply divide each of the sums of squares for main effects, interactions, and error by the
total sum of squares, arriving at estimates of proportions of total variation that are accounted for by each.
If eta squared for error is large, attention is directed to the variation of individual students' scores within
arrays of the display of raw values, where it will be seen that there is not a great deal of uniformity of
performances within those arrays. This finding would indicate that any statistically significant transfer
effect should not be taken too seriously; i.e., the differences among student performances are more
marked than differences among group means.
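
The division described above can be sketched for the simplest case, a one-way design with a single between-groups effect and an error term. The Python below is a minimal illustration under that assumption; the function names and scores are hypothetical.

```python
def one_way_sums_of_squares(groups):
    """Return (SS between groups, SS within groups, SS total) for a one-way design."""
    all_scores = [x for g in groups for x in g]
    grand_mean = sum(all_scores) / len(all_scores)
    ss_total = sum((x - grand_mean) ** 2 for x in all_scores)
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    ss_within = ss_total - ss_between  # the error term in a one-way design
    return ss_between, ss_within, ss_total


def eta_squared(ss_source, ss_total):
    """Proportion of total variation accounted for by one source."""
    return ss_source / ss_total


# Hypothetical trials-to-criterion scores: control group, then experimental group.
ss_between, ss_within, ss_total = one_way_sums_of_squares(
    [[20, 22, 18, 24], [12, 14, 10, 16]]
)
print("eta squared, groups:", round(eta_squared(ss_between, ss_total), 3))
print("eta squared, error: ", round(eta_squared(ss_within, ss_total), 3))
```

In a factorial design the same division is applied to each main effect and interaction in the summary table, and the resulting proportions sum to 1.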

On the other hand, if the greater proportion of variation is associated with, say, main effects or
interaction effects, i.e., the values of eta squared are relatively large, an inspection of the raw data will
show that performance within arrays is reasonably uniform and that mean differences among groups,
which are of principal interest, represent strong effects. In other words, the larger the estimate of strength
of association for main effects or interaction effects, the more credible are the results, p levels
notwithstanding.

While it is unfortunate that many available computer programs do not provide for calculation of
these values of strength of association, it is a relatively easy matter to calculate them "by hand" or to
provide that simple subroutines be added to those programs to present this critically important
information.
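
As an indication of how small such a subroutine can be, the sketch below computes estimated omega squared for a one-way, fixed-effects design from quantities found in any analysis-of-variance summary table. The formula is the commonly used one; that it matches the cited Hays (1973) pages term for term is an assumption, and the function name and input values are hypothetical.

```python
def estimated_omega_squared(ss_between, ss_within, n_groups, n_total):
    """Estimated omega squared for a one-way, fixed-effects analysis of variance.

    Assumed form: (SS_between - (J - 1) * MS_within) / (SS_total + MS_within),
    where J is the number of groups.
    """
    ms_within = ss_within / (n_total - n_groups)
    ss_total = ss_between + ss_within
    return (ss_between - (n_groups - 1) * ms_within) / (ss_total + ms_within)


# Hypothetical summary-table values: two groups of four students each.
print(round(estimated_omega_squared(128.0, 40.0, n_groups=2, n_total=8), 3))
```

Because it corrects for the error mean square, the estimate is always somewhat smaller than the descriptive eta squared computed from the same table.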

In ending this discussion, it should be noted that, within limits, undue concern with underlying
assumptions of parametric tests is incorrect, as is the insistence that parametrics be used only with data
associated with interval or ratio scales. These fallacies take away the researcher's most powerful and
versatile inferential tools. The notion of "robustness" of parametrics in terms of departures from
assumptions of normality and homoscedasticity, careful interpretations of the assumption of data
independence, and scales of measurement is discussed by Baker, Hardyck, & Petrinovich, 1970; Boneau,
1960; Burke, 1953; Hays, 1973; and Lord, 1953. The excessive use of nonparametric tests also is to
be avoided because these tests tend to throw away large portions of the data and, in general, are
characterized by relatively low power (e.g., they might not reject a false null hypothesis).

XVI. SOME CLOSING REMARKS

The goal of studies of transfer of learning is to provide information about techniques or equipment,
the use of which can serve as guides for designing or updating training curricula. The likelihood that the
information will be used depends on the extent to which both study method and results are convincing to
the personnel responsible for operational training. Studies demonstrating large performance effects
resulting from simulator pretraining certainly will be the most convincing and, other things being equal,
will be the most likely to result in the use of experimental techniques or equipment during operational
training.

This report has discussed a number of issues concerned with research methods, with emphasis on the
need for careful planning. It has addressed definitions of the problem and the task, considerations of
students, instructors, performance measurement, time requirements, dilutant factors, scheduling, the
busy operational environment, method testing, and analysis of results. These issues provide the means by
which the researcher can attempt to conduct a study illustrating the maximum possible transfer estimate
for the task at hand, illustrating for the operational instructor what can be accomplished.

It is hoped that the researcher, viewing all of these issues in the aggregate, will not arrive at the
unfortunate conclusion that it is virtually impossible to run a truly effective study of transfer of learning.
Certainly no single study is likely to be able to observe all of the issues in their absolute form. But to the
extent that a great many issues are taken into account, to that same extent the transfer study is likely to
provide sound and useful results of benefit to the operational training community.